4 Link Community Data
4.1 Overview
Stuff
- items
4.2 Environment Setup
To replicate the codes & functions illustrated in this tutorial, you’ll need to have R and RStudio downloaded and installed on your system. This tutorial assumes some familiarity with the R programming language.
4.2.1 Packages used
We will use the following packages in this tutorial:
sf: to manipulate spatial datatmap: to visualize and create mapsunits: to convert units within spatial data
4.2.2 Required Inputs and Expected Outputs
Our inputs will be:
- a SHP zip code boundary file (“chicago_zips.shp”), a
- a CSV file with COVID case data from the city data portal, and
- a SHP file with the locations of our resources (“methadoneClinics.shp”),
We will calculate the minimum distance between the resources and the centroids of the zip codes, then save the results as a shapefile and as a CSV. Our final result will be a shapefile/CSV with the minimum distance value for each zip.
4.3 Load data
First we’ll load the update zip code dataset from a previous labs.
## tibble [85 × 10] (S3: sf/tbl_df/tbl/data.frame)
## $ ZCTA5CE10 : chr [1:85] "60501" "60007" "60651" "60652" ...
## $ GEOID10 : chr [1:85] "60501" "60007" "60651" "60652" ...
## $ CLASSFP10 : chr [1:85] "B5" "B5" "B5" "B5" ...
## $ MTFCC10 : chr [1:85] "G6350" "G6350" "G6350" "G6350" ...
## $ FUNCSTAT10: chr [1:85] "S" "S" "S" "S" ...
## $ ALAND10 : chr [1:85] "12532295" "36493383" "9052862" "12987857" ...
## $ AWATER10 : chr [1:85] "974360" "917560" "0" "0" ...
## $ INTPTLAT10: chr [1:85] "+41.7802209" "+42.0086000" "+41.9020934" "+41.7479319" ...
## $ INTPTLON10: chr [1:85] "-087.8232440" "-087.9973398" "-087.7408565" "-087.7147951" ...
## $ geometry :sfc_MULTIPOLYGON of length 85; first list element: List of 1
## ..$ :List of 1
## .. ..$ : num [1:671, 1:2] -87.9 -87.9 -87.9 -87.9 -87.9 ...
## ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
## - attr(*, "sf_column")= chr "geometry"
## - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA
## ..- attr(*, "names")= chr [1:9] "ZCTA5CE10" "GEOID10" "CLASSFP10" "MTFCC10" ...
## Reading layer `methadoneClinics' from data source `/Users/qinyun/Documents/opioid-environment-toolkit/methadoneClinics.shp' using driver `ESRI Shapefile'
## Simple feature collection with 27 features and 8 fields
## geometry type: POINT
## dimension: XY
## bbox: xmin: -87.7349 ymin: 41.68698 xmax: -87.57673 ymax: 41.96475
## geographic CRS: WGS 84
Next, we’ll load some new data we’re interested in joining in: Chicago COVID-19 Cases, Tests, and Deaths by ZIP Code, found on the city data portal here. We’ll load in a CSV and inspect the data:
## 'data.frame': 1860 obs. of 21 variables:
## $ ZIP.Code : chr "60603" "60604" "60611" "60611" ...
## $ Week.Number : int 39 39 16 15 11 10 11 12 13 14 ...
## $ Week.Start : chr "09/20/2020" "09/20/2020" "04/12/2020" "04/05/2020" ...
## $ Week.End : chr "09/26/2020" "09/26/2020" "04/18/2020" "04/11/2020" ...
## $ Cases...Weekly : int 0 0 8 7 NA NA NA NA NA NA ...
## $ Cases...Cumulative : int 13 31 72 64 NA NA NA NA NA NA ...
## $ Case.Rate...Weekly : int 0 0 25 22 NA NA NA NA NA NA ...
## $ Case.Rate...Cumulative : num 1107 3964 222 197 NA ...
## $ Tests...Weekly : int 25 12 101 59 6 0 0 1 3 4 ...
## $ Tests...Cumulative : int 327 339 450 349 9 0 0 1 4 8 ...
## $ Test.Rate...Weekly : int 2130 1534 312 182 14 0 0 85 256 341 ...
## $ Test.Rate...Cumulative : num 27853.5 43350.4 1387.8 1076.3 21.7 ...
## $ Percent.Tested.Positive...Weekly : num 0 0 0.1 0.1 NA NA NA NA NA NA ...
## $ Percent.Tested.Positive...Cumulative: num 0 0.1 0.2 0.2 NA NA NA NA NA NA ...
## $ Deaths...Weekly : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Deaths...Cumulative : int 0 0 0 0 0 0 0 0 0 0 ...
## $ Death.Rate...Weekly : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Death.Rate...Cumulative : num 0 0 0 0 0 0 0 0 0 0 ...
## $ Population : int 1174 782 32426 32426 41563 1174 1174 1174 1174 1174 ...
## $ Row.ID : chr "60603-39" "60604-39" "60611-16" "60611-15" ...
## $ ZIP.Code.Location : chr "POINT (-87.625473 41.880112)" "POINT (-87.629029 41.878153)" "POINT (-87.620291 41.894734)" "POINT (-87.620291 41.894734)" ...
4.4 Clean & Merge Data
#COVID1 <- COVID[, c("`ZIP Code`", "`Percent Tested Positive - Cumulative`","Population")]
#head(COVID1)
head(chicago_zips)## Simple feature collection with 6 features and 9 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -88.06058 ymin: 41.73452 xmax: -87.58209 ymax: 42.04052
## geographic CRS: WGS 84
## # A tibble: 6 x 10
## ZCTA5CE10 GEOID10 CLASSFP10 MTFCC10 FUNCSTAT10 ALAND10 AWATER10 INTPTLAT10 INTPTLON10
## <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr> <chr>
## 1 60501 60501 B5 G6350 S 125322… 974360 +41.78022… -087.8232…
## 2 60007 60007 B5 G6350 S 364933… 917560 +42.00860… -087.9973…
## 3 60651 60651 B5 G6350 S 9052862 0 +41.90209… -087.7408…
## 4 60652 60652 B5 G6350 S 129878… 0 +41.74793… -087.7147…
## 5 60653 60653 B5 G6350 S 6041418 1696670 +41.81996… -087.6059…
## 6 60654 60654 B5 G6350 S 1464813 113471 +41.89182… -087.6383…
## # … with 1 more variable: geometry <MULTIPOLYGON [°]>
Let’s merge the data using the zip code geographic identifier, “ZIP Code” field, to bring in the the Percent Tested Positive - Cumalative dataset.
## Simple feature collection with 6 features and 30 fields
## geometry type: MULTIPOLYGON
## dimension: XY
## bbox: xmin: -87.63396 ymin: 41.88083 xmax: -87.6129 ymax: 41.88893
## geographic CRS: WGS 84
## GEOID10 ZCTA5CE10 CLASSFP10 MTFCC10 FUNCSTAT10 ALAND10 AWATER10 INTPTLAT10 INTPTLON10 ZIP.Code
## 1 60601 60601 B5 G6350 S 934226 60682 +41.8856419 -087.6215226 60601
## 2 60601 60601 B5 G6350 S 934226 60682 +41.8856419 -087.6215226 60601
## 3 60601 60601 B5 G6350 S 934226 60682 +41.8856419 -087.6215226 60601
## 4 60601 60601 B5 G6350 S 934226 60682 +41.8856419 -087.6215226 60601
## 5 60601 60601 B5 G6350 S 934226 60682 +41.8856419 -087.6215226 60601
## 6 60601 60601 B5 G6350 S 934226 60682 +41.8856419 -087.6215226 60601
## Week.Number Week.Start Week.End Cases...Weekly Cases...Cumulative Case.Rate...Weekly
## 1 39 09/20/2020 09/26/2020 8 213 54
## 2 33 08/09/2020 08/15/2020 8 128 54
## 3 34 08/16/2020 08/22/2020 7 135 48
## 4 25 06/14/2020 06/20/2020 5 82 34
## 5 13 03/22/2020 03/28/2020 9 23 61
## 6 22 05/24/2020 05/30/2020 2 70 14
## Case.Rate...Cumulative Tests...Weekly Tests...Cumulative Test.Rate...Weekly Test.Rate...Cumulative
## 1 1451.4 202 4304 1376 29328.8
## 2 872.2 216 2303 1472 15693.4
## 3 919.9 240 2543 1635 17328.8
## 4 558.8 100 881 681 6003.4
## 5 156.7 39 79 266 538.3
## 6 477.0 92 617 627 4204.4
## Percent.Tested.Positive...Weekly Percent.Tested.Positive...Cumulative Deaths...Weekly
## 1 0.0 0.0 1
## 2 0.0 0.1 0
## 3 0.0 0.1 0
## 4 0.0 0.1 0
## 5 0.2 0.3 0
## 6 0.0 0.1 0
## Deaths...Cumulative Death.Rate...Weekly Death.Rate...Cumulative Population Row.ID
## 1 6 6.8 40.9 14675 60601-39
## 2 5 0.0 34.1 14675 60601-33
## 3 5 0.0 34.1 14675 60601-34
## 4 5 0.0 34.1 14675 60601-25
## 5 0 0.0 0.0 14675 60601-13
## 6 4 0.0 27.3 14675 60601-22
## ZIP.Code.Location geometry
## 1 POINT (-87.622844 41.886262) MULTIPOLYGON (((-87.63396 4...
## 2 POINT (-87.622844 41.886262) MULTIPOLYGON (((-87.63396 4...
## 3 POINT (-87.622844 41.886262) MULTIPOLYGON (((-87.63396 4...
## 4 POINT (-87.622844 41.886262) MULTIPOLYGON (((-87.63396 4...
## 5 POINT (-87.622844 41.886262) MULTIPOLYGON (((-87.63396 4...
## 6 POINT (-87.622844 41.886262) MULTIPOLYGON (((-87.63396 4...
## Classes 'sf' and 'data.frame': 1798 obs. of 32 variables:
## $ GEOID10 : chr "60601" "60601" "60601" "60601" ...
## $ ZCTA5CE10 : chr "60601" "60601" "60601" "60601" ...
## $ CLASSFP10 : chr "B5" "B5" "B5" "B5" ...
## $ MTFCC10 : chr "G6350" "G6350" "G6350" "G6350" ...
## $ FUNCSTAT10 : chr "S" "S" "S" "S" ...
## $ ALAND10 : chr "934226" "934226" "934226" "934226" ...
## $ AWATER10 : chr "60682" "60682" "60682" "60682" ...
## $ INTPTLAT10 : chr "+41.8856419" "+41.8856419" "+41.8856419" "+41.8856419" ...
## $ INTPTLON10 : chr "-087.6215226" "-087.6215226" "-087.6215226" "-087.6215226" ...
## $ ZIP.Code : chr "60601" "60601" "60601" "60601" ...
## $ Week.Number : int 39 33 34 25 13 22 27 35 24 11 ...
## $ Week.Start : chr "09/20/2020" "08/09/2020" "08/16/2020" "06/14/2020" ...
## $ Week.End : chr "09/26/2020" "08/15/2020" "08/22/2020" "06/20/2020" ...
## $ Cases...Weekly : int 8 8 7 5 9 2 3 13 6 NA ...
## $ Cases...Cumulative : int 213 128 135 82 23 70 89 148 77 NA ...
## $ Case.Rate...Weekly : int 54 54 48 34 61 14 20 89 41 NA ...
## $ Case.Rate...Cumulative : num 1451 872 920 559 157 ...
## $ Tests...Weekly : int 202 216 240 100 39 92 132 254 93 6 ...
## $ Tests...Cumulative : int 4304 2303 2543 881 79 617 1161 2797 781 7 ...
## $ Test.Rate...Weekly : int 1376 1472 1635 681 266 627 900 1731 634 41 ...
## $ Test.Rate...Cumulative : num 29329 15693 17329 6003 538 ...
## $ Percent.Tested.Positive...Weekly : num 0 0 0 0 0.2 0 0 0.1 0.1 NA ...
## $ Percent.Tested.Positive...Cumulative: num 0 0.1 0.1 0.1 0.3 0.1 0.1 0.1 0.1 NA ...
## $ Deaths...Weekly : int 1 0 0 0 0 0 0 0 0 0 ...
## $ Deaths...Cumulative : int 6 5 5 5 0 4 5 5 5 0 ...
## $ Death.Rate...Weekly : num 6.8 0 0 0 0 0 0 0 0 0 ...
## $ Death.Rate...Cumulative : num 40.9 34.1 34.1 34.1 0 27.3 34.1 34.1 34.1 0 ...
## $ Population : int 14675 14675 14675 14675 14675 14675 14675 14675 14675 14675 ...
## $ Row.ID : chr "60601-39" "60601-33" "60601-34" "60601-25" ...
## $ ZIP.Code.Location : chr "POINT (-87.622844 41.886262)" "POINT (-87.622844 41.886262)" "POINT (-87.622844 41.886262)" "POINT (-87.622844 41.886262)" ...
## $ geometry :sfc_MULTIPOLYGON of length 1798; first list element: List of 1
## ..$ :List of 1
## .. ..$ : num [1:183, 1:2] -87.6 -87.6 -87.6 -87.6 -87.6 ...
## ..- attr(*, "class")= chr [1:3] "XY" "MULTIPOLYGON" "sfg"
## $ COVIDCaseRt : num 1451 872 920 559 157 ...
## - attr(*, "sf_column")= chr "geometry"
## - attr(*, "agr")= Factor w/ 3 levels "constant","aggregate",..: NA NA NA NA NA NA NA NA NA NA ...
## ..- attr(*, "names")= chr [1:31] "GEOID10" "ZCTA5CE10" "CLASSFP10" "MTFCC10" ...
## Reading layer `mclinicarea' from data source `/Users/qinyun/Documents/opioid-environment-toolkit/data/mclinicarea.shp' using driver `ESRI Shapefile'
## Simple feature collection with 1 feature and 1 field
## geometry type: POLYGON
## dimension: XY
## bbox: xmin: 1136699 ymin: 1818770 xmax: 1201240 ymax: 1941031
## projected CRS: NAD83 / Illinois East (ftUS)